Skip to content

Speedup PP rules cube construction.#583

Closed
pp-mo wants to merge 2 commits into
SciTools:masterfrom
pp-mo:pp_speedup_addcoord
Closed

Speedup PP rules cube construction.#583
pp-mo wants to merge 2 commits into
SciTools:masterfrom
pp-mo:pp_speedup_addcoord

Conversation

@pp-mo

@pp-mo pp-mo commented Jun 28, 2013

Copy link
Copy Markdown
Member

Defines + uses lower-level versions of cube.add_(dim/aux)_coord.
Produces a speedup to cubes construction in import, with a small improvement (~10%) to overall load speeds.

@pp-mo

pp-mo commented Jun 28, 2013

Copy link
Copy Markdown
Member Author

Context:

Analysis of building cubes with rules gave roughly these proportions of time percentage for a simple iris.load_raw call....
15.5 % : file loading + PPField construction
15.6% : rules evaluation (= condition testing)
37.7% : rule actions execution (= building cube components like coords + attributes)
31.2% : rules result processing (= adding cube components into the cube)

This PR addresses the last of these areas.

@pp-mo

pp-mo commented Jun 28, 2013

Copy link
Copy Markdown
Member Author

Test results:

Testing with a standard test-data file...
Test code (in /net/home/h05/itpp/Iris/odds/pp_speedtest_2.sh) :

    #!/usr/bin/env python2.7
    from datetime import datetime as dt
    import shutil
    import tempfile
    import os.path

    import iris
    import iris.tests

    source_filepath = iris.tests.get_data_path(('PP','COLPEX','theta_and_orog_subset.pp'))
    with tempfile.NamedTemporaryFile(suffix='.pp', delete=False) as f:
      temp_local_filename = f.name
    shutil.copyfile(source_filepath, temp_local_filename)

    try:
      for i_try in range(4):
        t1=dt.now()
        iris.load_raw(temp_local_filename)
        t2=dt.now()
        print 'LOAD TOOK:',(t2-t1).total_seconds()
    finally:
      os.remove(temp_local_filename)

Results:

    itpp@eld238: /net/home/h05/itpp/git/iris/iris_main > git checkout upstream/master
      . . .
    HEAD is now at 9642173... Merge pull request #570 from pelson/new_sample_data_update
    itpp@eld238: /net/home/h05/itpp/git/iris/iris_main > /home/h05/itpp/Iris/odds/pp_speedtest_2.sh
    LOAD TOOK: 2.396032
    LOAD TOOK: 2.388618
    LOAD TOOK: 2.356158
    LOAD TOOK: 2.354932
    itpp@eld238: /net/home/h05/itpp/git/iris/iris_main > git checkout pp_speedup_addcoord          
    Previous HEAD position was 9642173... Merge pull request #570 from pelson/new_sample_data_update
    Switched to branch 'pp_speedup_addcoord'
    itpp@eld238: /net/home/h05/itpp/git/iris/iris_main > /home/h05/itpp/Iris/odds/pp_speedtest_2.sh
    LOAD TOOK: 2.25511
    LOAD TOOK: 2.217443
    LOAD TOOK: 2.251518
    LOAD TOOK: 2.23761
    itpp@eld238: /net/home/h05/itpp/git/iris/iris_main > 

Comment thread lib/iris/fileformats/rules.py Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not your doing but this needs fixing.

@pp-mo

pp-mo commented Jul 3, 2013

Copy link
Copy Markdown
Member Author

Travis test failed due to CF standard-namers-table glitch (believed now fixed).
Re-pushed with changed comment to re-test.
TODO: rebase (squish) when required.

@cpelley

cpelley commented Jul 9, 2013

Copy link
Copy Markdown

discussion relating to coord-dim association is ongoing and may effect this PR, see #587

@cpelley

cpelley commented Jul 11, 2013

Copy link
Copy Markdown

I see nobody is assigned, I'm happy to review, change PR description to reflect this

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This checks whether the coordinate is already 'used', however the more information checked against, the more often that it will correctly determine the answer, thus bypass the more hungry coordinate search (cube.coord). Why not also add 'var_name' here to catch more cases?

@rhattersley

Copy link
Copy Markdown
Member

I think we can achieve the same optimisation without accessing private methods or adding to the API ...

Instead of adding coordinates on to an existing cube, we modify the field-to-cube conversion to collect all the coordinates it makes and pass them to the Cube constructor via the existing dim_coords_and_dims and aux_coords_and_dims arguments. Then we can add the quick-lookup coordinate uniqueness checks to the Cube constructor. Win-win?

@rhattersley

Copy link
Copy Markdown
Member

I think we can achieve the same optimisation without accessing private methods or adding to the API ...

See #637 (and SciTools/iris-code-generators#8).

@rhattersley

Copy link
Copy Markdown
Member

I'm closing this because (thanks to hindsight) we've basically extracted the benefit via #637. If there are remaining benefits which have not been addressed elsewhere then please do submit a new PR.

@rhattersley rhattersley closed this Aug 6, 2013
@pp-mo pp-mo deleted the pp_speedup_addcoord branch March 14, 2016 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants